110 research outputs found

    Data Pipeline Management in Practice: Challenges and Opportunities

    Data pipelines involve a complex chain of interconnected activities that starts with a data source and ends in a data sink. Data pipelines are important for data-driven organizations since a data pipeline can process data in multiple formats from distributed data sources with minimal human intervention, accelerate data life cycle activities, and enhance productivity in data-driven enterprises. Implementing data pipelines presents both challenges and opportunities, but practical industry experiences are seldom reported. The findings of this study are derived from a qualitative multiple-case study and interviews with representatives of three companies. The challenges include data quality issues, infrastructure maintenance problems, and organizational barriers. On the other hand, data pipelines are implemented to enable traceability and fault tolerance and to reduce human error by maximizing automation, thereby producing high-quality data. Based on multiple-case study research with five use cases from three case companies, this paper identifies the key challenges and benefits associated with the implementation and use of data pipelines

    Building robust prediction models for defective sensor data using Artificial Neural Networks

    Predicting the health of components in complex dynamic systems such as an automobile poses numerous challenges. The primary aim of such predictive systems is to use the high-dimensional data acquired from different sensors and predict the state-of-health of a particular component, e.g., brake pad. The classical approach involves selecting a smaller set of relevant sensor signals using feature selection and using them to train a machine learning algorithm. However, this fails to address two prominent problems: (1) sensors are susceptible to failure when exposed to extreme conditions over long periods of time; (2) sensors are electrical devices that can be affected by noise or electrical interference. Using failed and noisy sensor signals as inputs greatly reduces prediction accuracy. To tackle this problem, it is advantageous to use the information from all sensor signals, so that the failure of one sensor can be compensated by another. In this work, we propose an Artificial Neural Network (ANN) based framework to exploit the information from a large number of signals. In addition, our framework introduces a data augmentation approach to perform accurate predictions in spite of noisy signals. The plausibility of our framework is validated on a real-life industrial application from Robert Bosch GmbH. Comment: 16 pages, 7 figures. Currently under review. This research has obtained funding from the Electronic Components and Systems for European Leadership (ECSEL) Joint Undertaking, the framework programme for research and innovation Horizon 2020 (2014-2020) under grant agreement number 662189-MANTIS-2014-

    A process pattern model for tackling and improving big data quality

    Data seldom create value by themselves. They need to be linked and combined from multiple sources, which can often come with variable data quality. The task of improving data quality is a recurring challenge. In this paper, we use a case study of a large telecom company to develop a generic process pattern model for improving data quality. The process pattern model is defined as a proven series of activities, aimed at improving the data quality given a certain context, a particular objective, and a specific set of initial conditions. Four different patterns are derived to deal with the variations in data quality of datasets. Instead of having to find the way to improve the quality of big data for each situation, the process model provides data users with generic patterns, which can be used as a reference model to improve big data quality

    Subcellular localization of type-I thionins in the endosperms of wheat and barley.

    Thionins are cysteine-rich polypeptides of about 5,000 Da. Localization at the subcellular level of type I endosperm thionins has been carried out by immunogold labeling, using an antibody that recognizes type I thionin variants. In developing wheat and barley caryopses, sectioned at different times between 13 and 24 days after flowering, this type of thionins was only detected around protein bodies from cells of the starchy endosperm, using light microscopy. Electron microscopy revealed that these proteins were located in electron-dense spheroids in the periphery of protein bodies, at the earlier stages, whereas later the label appeared also as a thin layer around these organelles

    An Integrated Approach to the Prediction of Chemotherapeutic Response in Patients with Breast Cancer

    BACKGROUND: A major challenge in oncology is the selection of the most effective chemotherapeutic agents for individual patients, while the administration of ineffective chemotherapy increases mortality and decreases quality of life in cancer patients. This emphasizes the need to evaluate every patient's probability of responding to each chemotherapeutic agent and limiting the agents used to those most likely to be effective. METHODS AND RESULTS: Using gene expression data on the NCI-60 and corresponding drug sensitivity, mRNA and microRNA profiles were developed representing sensitivity to individual chemotherapeutic agents. The mRNA signatures were tested in an independent cohort of 133 breast cancer patients treated with the TFAC (paclitaxel, 5-fluorouracil, adriamycin, and cyclophosphamide) chemotherapy regimen. To further dissect the biology of resistance, we applied signatures of oncogenic pathway activation and performed hierarchical clustering. We then used mRNA signatures of chemotherapy sensitivity to identify alternative therapeutics for patients resistant to TFAC. Profiles from mRNA and microRNA expression data represent distinct biologic mechanisms of resistance to common cytotoxic agents. The individual mRNA signatures were validated in an independent dataset of breast tumors (P = 0.002, NPV = 82%). When the accuracy of the signatures was analyzed based on molecular variables, the predictive ability was found to be greater in basal-like than non basal-like patients (P = 0.03 and P = 0.06). Samples from patients with co-activated Myc and E2F represented the cohort with the lowest percentage (8%) of responders. Using mRNA signatures of sensitivity to other cytotoxic agents, we predict that TFAC non-responders are more likely to be sensitive to docetaxel (P = 0.04), representing a viable alternative therapy. 
CONCLUSIONS: Our results suggest that the optimal strategy for chemotherapy sensitivity prediction integrates molecular variables such as ER and HER2 status with corresponding microRNA and mRNA expression profiles. Importantly, we also present evidence to support the concept that analysis of molecular variables can present a rational strategy to identifying alternative therapeutic opportunities

    Sustainable Urban Systems: Co-design and Framing for Transformation

    Rapid urbanisation generates risks and opportunities for sustainable development. Urban policy and decision makers are challenged by the complexity of cities as social–ecological–technical systems. Consequently, there is an increasing need for collaborative knowledge development that supports a whole-of-system view and transformational change at multiple scales. Such holistic urban approaches are rare in practice. A co-design process involving researchers, practitioners and other stakeholders has progressed such an approach in the Australian context, aiming to also contribute to international knowledge development and sharing. This process has generated three outputs: (1) a shared framework to support more systematic knowledge development and use, (2) identification of barriers that create a gap between stated urban goals and actual practice, and (3) identification of strategic focal areas to address this gap. Developing integrated strategies at broader urban scales is seen as the most pressing need. The knowledge framework adopts a systems perspective that incorporates the many urban trade-offs and synergies revealed by a systems view. Broader implications are drawn for policy and decision makers, for researchers and for a shared forward agenda

    JCMT BISTRO Observations: Magnetic Field Morphology of Bubbles Associated with NGC 6334

    We study the H II regions associated with the NGC 6334 molecular cloud observed in the submillimeter and taken as part of the B-fields In STar-forming Region Observations Survey. In particular, we investigate the polarization patterns and magnetic field morphologies associated with these H II regions. Through polarization pattern and pressure calculation analyses, several of these bubbles indicate that the gas and magnetic field lines have been pushed away from the bubble, toward an almost tangential (to the bubble) magnetic field morphology. In the densest part of NGC 6334, where the magnetic field morphology is similar to an hourglass, the polarization observations do not exhibit observable impact from H II regions. We detect two nested radial polarization patterns in a bubble to the south of NGC 6334 that correspond to the previously observed bipolar structure in this bubble. Finally, using the results of this study, we present steps (incorporating computer vision; circular Hough transform) that can be used in future studies to identify bubbles that have physically impacted magnetic field lines